Average cost Markov control processes with weighted norms: value iteration

Authors

E. Gordienko and O. Hernández-Lerma

Abstract

Related resources

E. GORDIENKO and O. HERNÁNDEZ-LERMA (México) AVERAGE COST MARKOV CONTROL PROCESSES WITH WEIGHTED NORMS: VALUE ITERATION

This paper shows the convergence of the value iteration (or successive approximations) algorithm for average cost (AC) Markov control processes on Borel spaces, with possibly unbounded cost, under appropriate hypotheses on weighted norms for the cost function and the transition law. It is also shown that the aforementioned convergence implies strong forms of AC-optimality and the existence of f...

E. GORDIENKO and O. HERNÁNDEZ-LERMA (México) AVERAGE COST MARKOV CONTROL PROCESSES WITH WEIGHTED NORMS: EXISTENCE OF CANONICAL POLICIES

This paper considers discrete-time Markov control processes on Borel spaces, with possibly unbounded costs, and the long run average cost (AC) criterion. Under appropriate hypotheses on weighted norms for the cost function and the transition law, the existence of solutions to the average cost optimality inequality and the average cost optimality equation are shown, which in turn yield the exist...

AVERAGE COST SEMI-MARKOV DECISION PROCESSES

The Semi-Markov Decision model is considered under the criterion of long-run average cost. A new criterion, which for any policy considers the limit of the expected cost incurred during the first n transitions divided by the expected length of the first n transitions, is considered. Conditions guaranteeing that an optimal stationary (nonrandomized) policy exists are then presented. It is also ...

Interactive Value Iteration for Markov Decision Processes with Unknown Rewards

To tackle the potentially hard task of defining the reward function in a Markov Decision Process, we propose a new approach, based on Value Iteration, which interweaves the elicitation and optimization phases. We assume that rewards whose numeric values are unknown can only be ordered, and that a tutor is present to help comparing sequences of rewards. We first show how the set of possible rewa...

A Simulation-Based Policy Iteration Algorithm for Average Cost Unichain Markov Decision Processes

In this paper, we propose a simulation-based policy iteration algorithm on Markov decision process (MDP) problems with average cost criterion under the unichain assumption, which is a weaker assumption than found in previous work. In this algorithm, 1) the problem is converted to a stochastic shortest path problem and a reference state can be chosen as any recurrent state under the current poli...

Journal

Journal title: Applicationes Mathematicae

Year: 1995

ISSN: 1233-7234,1730-6280

DOI: 10.4064/am-23-2-219-237